Removing Batch Effects from Longitudinal Gene Expression - Quantile Normalization Plus ComBat as Best Approach for Microarray Transcriptome Data
Abstract
Technical variation plays an important role in microarray-based gene expression studies, and batch effects explain a large proportion of this noise. It is therefore essential to eliminate technical variation while maintaining biological variability. Several strategies have been proposed for the removal of batch effects, but they have not been evaluated in large-scale longitudinal gene expression data. In this study, we aimed to identify a suitable method for batch effect removal in a large study of microarray-based longitudinal gene expression. Monocytic gene expression was measured in 1092 participants of the Gutenberg Health Study at baseline and at 5-year follow-up. Replicates of selected samples were measured at both time points to identify technical variability. Deming regression, Passing-Bablok regression, linear mixed models, non-linear models, ReplicateRUV and ComBat were applied to eliminate batch effects between replicates. In a second step, each method was repeated with quantile normalization performed prior to batch effect correction. Technical variation between batches was evaluated by principal component analysis. Associations between body mass index and transcriptomes were calculated before and after batch effect removal, and the results of these association analyses were compared to evaluate whether biological variability was maintained. Quantile normalization, performed separately in each batch, combined with ComBat successfully reduced batch effects and maintained biological variability. ReplicateRUV performed perfectly in the replicate data subset of the study, but failed when applied to all samples. None of the other methods substantially reduced batch effects in the replicate data subset. Quantile normalization plus ComBat appears to be a valuable approach for batch correction in longitudinal gene expression data.
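The first step of the recommended approach, quantile normalization performed separately in each batch, can be sketched as follows. This is a minimal illustration and not the study's own code: it assumes a genes x samples matrix, and the function names, the simulated data and the size of the batch shift are all hypothetical. Ties are broken by order of appearance rather than averaged. The ComBat step is not reproduced here; in practice it would be run on the normalized matrix with an existing implementation such as the ComBat function in the R package sva.

```python
import numpy as np

def quantile_normalize(x):
    """Quantile-normalize the columns (samples) of a genes x samples matrix.

    Each sample's values are replaced by the mean of the values that share
    the same rank across all samples. Ties are broken by order of appearance.
    """
    x = np.asarray(x, dtype=float)
    order = np.argsort(x, axis=0, kind="stable")   # per-sample sort order
    reference = np.sort(x, axis=0).mean(axis=1)    # mean value at each rank
    out = np.empty_like(x)
    # Write the reference value for rank i at the row holding rank i.
    np.put_along_axis(out, order, reference[:, None], axis=0)
    return out

def quantile_normalize_by_batch(x, batches):
    """Apply quantile normalization within each batch separately."""
    x = np.asarray(x, dtype=float)
    batches = np.asarray(batches)
    out = np.empty_like(x)
    for b in np.unique(batches):
        cols = batches == b
        out[:, cols] = quantile_normalize(x[:, cols])
    return out

# Hypothetical toy data: 50 genes, 3 samples per batch,
# with an additive shift standing in for a batch effect.
rng = np.random.default_rng(0)
expr = rng.normal(size=(50, 6))
batches = np.array([0, 0, 0, 1, 1, 1])
expr[:, batches == 1] += 2.0

normalized = quantile_normalize_by_batch(expr, batches)
```

After this step, all samples within a batch share one distribution, while the two batches still differ from each other. That remaining between-batch difference is what the subsequent batch correction step is meant to remove.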
Similar Resources
Removing Batch Effects in Analysis of Expression Microarray Data: An Evaluation of Six Batch Adjustment Methods
The expression microarray is a frequently used approach to study gene expression on a genome-wide scale. However, the data produced by the thousands of microarray studies published annually are confounded by "batch effects," the systematic error introduced when samples are processed in multiple batches. Although batch effects can be reduced by careful experimental design, they cannot be elimina...
Statistical Applications in Genetics and Molecular Biology
Normalization of expression levels applied to microarray data can help in reducing measurement error. Different methods, including cyclic loess, quantile normalization and median or mean normalization, have been utilized to normalize microarray data. Although there is considerable literature regarding normalization techniques for mRNA microarray data, there are no publications comparing normali...
Differential Response of Cardiac Cells to Saturated and Unsaturated Fatty Acids
Introduction & Objective: The link between dietary fat and coronary heart disease has attracted much attention since the effect of long-chain fatty acids (LCFA) on gene transcription was established, and these effects can be explained in part by the regulation of gene transcription. In this study, the P19CL6 cardiac cell line was targeted for the investigation of (i) the effects of long...
ARRm: Adaptive Robust Regression method for normalization of methylation data
With the adaptation of microarray hybridization techniques developed for gene expression and genomics studies to methylation data, there has been a revolution in the development of DNA methylation profiling techniques [1]. Illumina recently released the Infinium HumanMethylation450 BeadChip, a single CpG site resolution array using bisulfite-converted DNA. The Infinium 450k methylation array co...
Enhanced quantile normalization of microarray data to reduce loss of information in gene expression profiles.
In microarray experiments, removal of systematic variations resulting from array preparation or sample hybridization conditions is crucial to ensure sensible results from the ensuing data analysis. For example, quantile normalization is routinely used in the treatment of both oligonucleotide and cDNA microarray data, even though there might be some loss of information in the normalization proce...